After exploring the general pattern of modelled versus observed GPP, the next step is to identify the specific period in which modelled GPP and observed GPP diverge at each site. This markdown file focuses on the sites preliminarily selected by Beni.

- Step 1: tidy the table of sites used for the simulated vs. observed GPP comparison
- Step 2: find a way to separate out the period of early GPP simulation by the model
```r
library(kableExtra)
library(readxl)

# Read the site summary sheet from the project info table
table.path <- "D:/CES/Data_for_use/"
my_data <- read_excel(
  paste0(table.path, "Info_Table_about_Photocold_project.xlsx"),
  sheet = "Only_sites_earlyGPPest"
)

# Alternative rendering: paper-style table inside a box with scroll bars
# my_data %>%
#   kbl(caption = "Summary of sites with early GPP estimation") %>%
#   kable_paper(full_width = F, html_font = "Cambria") %>%
#   scroll_box(width = "500px", height = "200px")

my_data %>%
  kbl(caption = "Summary of sites with early GPP estimation") %>%
  kable_classic(full_width = FALSE, html_font = "Cambria")
```
| Site name | Delay status | Long. | Lat. | Period | PFT | Clim. | N | Calib. | Available analyzed years (spring) | Available site-years (spring) | Available analyzed years (spring and winter) | Available site-years (spring and winter) | Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DE-Hai | Yes | 10.45 | 51.08 | 2000-2012 | DBF | Cfb | 4247 | Y | 2000-2012 | 13 | 2000-2012 | 13 | Knohl et al. (2003) |
| US-Syv | Yes | -89.35 | 46.24 | 2001-2014 | MF | Dfb | 2635 | Y | 2002-2006, 2014 | 6 | 2002, 2004-2006,2014 | 5 | Desai et al. (2005) |
| US-UMB | Yes | -84.71 | 45.56 | 2000-2014 | DBF | Dfb | 4015 | Y | 2000-2014 | 15 | 2000-2014 | 15 | Gough et al. (2013) |
| US-UMd | Yes | -84.70 | 45.56 | 2007-2014 | DBF | Dfb | 2050 | Y | 2008-2014 | 7 | 2008-2013 | 6 | Gough et al. (2013) |
| US-WCr | Yes | -90.08 | 45.81 | 1999-2014 | DBF | Dfb | 3425 | Y | 2000-2006, 2011-2014 | 11 | 2000-2006, 2011-2014 | 11 | Cook et al. (2004) |
| US-Wi3 | Yes | -91.10 | 46.63 | 2002-2004 | DBF | Dfb | 415 | NA | no years (2002, 2004 lack early doy) | 0 | no years | 0 | Noormets et al. (2007) |
| CA-Man | Yes | -98.48 | 55.88 | 1994-2008 | ENF | Dfc | 1910 | NA | 2000-2003, 2007-2008 | 6 | 2000-2003, 2007 | 5 | Dunn et al. (2007) |
| CA-NS2 | Yes | -98.52 | 55.91 | 2001-2005 | ENF | Dfc | 1123 | NA | 2002, 2004 (2003 lack early doy) | 2 | 2002 | 1 | NA |
| CA-NS4 | Yes | -98.38 | 55.91 | 2002-2005 | ENF | Dfc | 756 | NA | 2005 (2003 lack early doy) | 1 | no years | 0 | NA |
| CA-NS5 | Yes | -98.48 | 55.86 | 2001-2005 | ENF | Dfc | 1245 | NA | 2002, 2004-2005 (2003 lack early doy) | 3 | 2002, 2004 | 2 | NA |
| CA-Qfo | Yes | -74.34 | 49.69 | 2003-2010 | ENF | Dfc | 2416 | NA | 2004-2010 | 7 | 2004-2010 | 7 | Bergeron et al. (2007) |
| FI-Hyy | Yes | 24.30 | 61.85 | 1996-2014 | ENF | Dfc | 4587 | Y | 2000-2014 | 15 | 2000-2004, 2006-2014 | 14 | Suni et al. (2003) |
| IT-Tor | Yes | 7.58 | 45.84 | 2008-2014 | GRA | Dfc | 2172 | Y | 2009-2014 | 6 | 2009-2014 | 6 | Galvagno et al. (2013) |
| Sum | NA | NA | NA | NA | NA | NA | 30996 | NA | NA | 92 | NA | 73 | NA |
Number of available spring site-years printed for each site, grouped by climate zone (values in output order, matching the table above):

- Cfb: 13

(1) For Dfb, both MF and DBF sites (Dfb-MF: 1 site): 6, 15, 7, 11

(2) For Dfc, both GRA and ENF sites: 6, 6, 2, 1, 3, 7, 15

**Step 1: Normalization across all years at one site**
- Normalize `gpp_obs` and `gpp_mod` by `gpp_max` (the 95th percentile of GPP).
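
The normalization can be sketched in base R as follows. This is a minimal illustration, not the project code: the helper name `normalize_gpp` and the output column names are hypothetical, and it assumes that `gpp_max` is taken from the observed series so that both series share one scale.

```r
# Sketch: scale observed and modelled GPP by a common gpp_max.
# Assumption: gpp_max is the 95th percentile of gpp_obs over all years at the site.
normalize_gpp <- function(df, prob = 0.95) {
  gpp_max <- quantile(df$gpp_obs, probs = prob, na.rm = TRUE, names = FALSE)
  df$gpp_obs_norm <- df$gpp_obs / gpp_max
  df$gpp_mod_norm <- df$gpp_mod / gpp_max
  df
}

# Toy data standing in for one site (not real flux data)
site <- data.frame(gpp_obs = 0:100, gpp_mod = seq(0, 120, length.out = 101))
site_norm <- normalize_gpp(site)
```

Dividing both series by the same value keeps the bias GPP_mod - GPP_obs interpretable after normalization.
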
**Step 2: Determine the green-up period for each year (using spline-smoothed values)**
- The following analysis is based on the normalized GPP_mod time series, which gives an earlier SOS.
- Use the normalized GPP_mod to determine the SOS, EOS, and peak of the time series. In this study, SOS and EOS are set at a threshold of 10% of the seasonal amplitude. GPP_mod is used to determine the phenophases because it generally yields an earlier SOS than GPP_obs, which gives a longer analysis period.
- Update (Aug 31, 2011): SOS is constrained to be later than February (DoY 60) in order to remove unrealistic SOS values.
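
A rough sketch of this step for a single year, using `smooth.spline()` from base R. The function name `find_phenophases`, the smoothing parameter `spar = 0.5`, and the way the February constraint is applied (clamping SOS to DoY 61 at the earliest) are assumptions made for illustration; the input is assumed to contain no missing values.

```r
# Sketch: SOS, peak and EOS from a spline-smoothed, normalized GPP series.
# Assumptions: spar = 0.5 and the clamping rule for SOS are illustrative choices.
find_phenophases <- function(doy, gpp_norm, threshold = 0.10,
                             min_sos_doy = 60, spar = 0.5) {
  smoothed <- predict(smooth.spline(doy, gpp_norm, spar = spar), doy)$y
  peak_idx <- which.max(smoothed)
  base <- min(smoothed)
  # Threshold level: 10% of the seasonal amplitude above the baseline
  level <- base + threshold * (smoothed[peak_idx] - base)

  # SOS: first day after the last below-threshold day on the rising limb
  rising <- seq_len(peak_idx)
  below <- rising[smoothed[rising] < level]
  sos_idx <- if (length(below) > 0) max(below) + 1 else 1

  # EOS: last day before the first below-threshold day on the falling limb
  falling <- peak_idx:length(doy)
  below_after <- falling[smoothed[falling] < level]
  eos_idx <- if (length(below_after) > 0) min(below_after) - 1 else length(doy)

  list(sos = max(doy[sos_idx], min_sos_doy + 1),  # keep SOS later than DoY 60
       peak = doy[peak_idx],
       eos = doy[eos_idx])
}

# Toy seasonal curve peaking at DoY 200 (not real data)
doy <- 1:365
gpp_toy <- exp(-(doy - 200)^2 / (2 * 40^2))
ph <- find_phenophases(doy, gpp_toy)
```
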
**Step 3: Rolling mean of GPP_obs and GPP_mod for all years (moving windows of 5, 7, 10, 15, and 20 days)**
**This also covers the data outside the green-up period; the code for this step has been moved to the second step.**
- In the end, I selected the 20-day window for the rolling mean.
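
A rolling mean of this kind can be sketched with `stats::filter()` from base R. The helper name `rolling_mean` is hypothetical, and the centred alignment (`sides = 2`) is an assumption, since the notes do not say how the window is aligned. Values near the ends of the series, and any window containing a missing value, come out as `NA`.

```r
# Sketch: centred moving average with equal weights.
# Assumption: the window is centred on each day (sides = 2).
rolling_mean <- function(x, window = 20) {
  as.numeric(stats::filter(x, rep(1 / window, window), sides = 2))
}

# Toy series (not real data)
x_toy <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
roll5 <- rolling_mean(x_toy, window = 5)
roll20 <- rolling_mean(rep(2, 50), window = 20)
```
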
**Step 4: Fit a Gaussian (normal) distribution to the residuals outside the green-up period**
- The reason for doing this: we assume that the P-model generally estimates GPP well outside the green-up period (compared with the observations).
- In practice, however, model performance is not always good outside the green-up period, so I tested three data ranges:
a. DoY [peak, 365/366]
b. DoY [1, sos] and DoY [peak, 365/366]
c. DoY [1, sos] and DoY [eos, 365/366]

I found that with data range c the distribution of the bias (GPP_mod - GPP_obs) is closest to a normal distribution, so in the end I used data range c to build the distribution.
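
The fit on data range c can be sketched by estimating the mean and standard deviation of the bias outside [sos, eos]. The function name `fit_bias_distribution` and the column names `doy`, `gpp_mod`, and `gpp_obs` are hypothetical, and using the sample mean and `sd()` as the normal parameters is an assumption about how the distribution was fitted.

```r
# Sketch: normal parameters of the bias for DoY [1, sos] and DoY [eos, 365/366].
# Assumption: the fit is summarized by the sample mean and standard deviation.
fit_bias_distribution <- function(df, sos, eos) {
  outside <- df$doy <= sos | df$doy >= eos
  bias <- df$gpp_mod[outside] - df$gpp_obs[outside]
  bias <- bias[!is.na(bias)]
  list(mean = mean(bias), sd = sd(bias), n = length(bias))
}

# Toy data for one year (not real data)
toy <- data.frame(doy = 1:10,
                  gpp_mod = c(1, 2, 9, 9, 9, 9, 9, 9, 3, 4),
                  gpp_obs = 0)
bias_fit <- fit_bias_distribution(toy, sos = 2, eos = 9)
```
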
**Step 5: Determine "is_event" within the green-up period**
- After some consideration, I adopted the following criteria to determine "is_event":
1) During the green-up period (sos, peak), data points with a GPP bias larger than 3 SD are classified as "GPP overestimation points".
2) Among the "GPP overestimation points", only the data points in the first 2/3 of the green-up period are regarded as "is_event".
3) Among the "is_event" points, those with an air temperature below 10 °C are classified as "is_event_less10". I chose 10 °C as the criterion by referring to Duffy et al., 2021 and to the many papers showing that the temperature response curve normally starts from 10 °C (for instance, Lin et al., 2012).
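
The three criteria can be sketched as a single flagging function. The name `flag_events`, the column names (including `temp` for air temperature in °C), and the reading of "larger than 3 SD" as "larger than the fitted mean plus 3 SD" are assumptions for illustration. Rows with missing values yield `NA` flags in this sketch.

```r
# Sketch: flag "is_event" and "is_event_less10" within the green-up period.
# Assumptions: threshold = bias_mean + 3 * bias_sd; temp is in degrees Celsius.
flag_events <- function(df, sos, peak, bias_mean, bias_sd,
                        n_sd = 3, frac = 2 / 3, t_crit = 10) {
  bias <- df$gpp_mod - df$gpp_obs
  in_greenup <- df$doy >= sos & df$doy <= peak
  # 1) GPP overestimation points within the green-up period
  over <- in_greenup & bias > bias_mean + n_sd * bias_sd
  # 2) keep only points in the first 2/3 of the green-up period
  cutoff <- sos + frac * (peak - sos)
  df$is_event <- over & df$doy <= cutoff
  # 3) events that occur at low air temperature
  df$is_event_less10 <- df$is_event & df$temp < t_crit
  df
}

# Toy data (not real data)
toy_days <- data.frame(doy = c(105, 110, 118, 125, 140),
                       gpp_mod = 1,
                       gpp_obs = c(0, 0, 0.9, 0, 0),
                       temp = c(5, 15, 5, 5, 5))
flagged <- flag_events(toy_days, sos = 100, peak = 130,
                       bias_mean = 0, bias_sd = 0.1)
```
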
References:
- Duffy et al., 2021: https://advances.sciencemag.org/content/7/3/eaay1052
- Lin et al., 2012: https://academic.oup.com/treephys/article/32/2/219/1657108
**Step 6: Evaluate "is_event" (visualization and statistics)**
- Two ways to evaluate whether "is_event" is properly determined:
1) Visualization
2) Statistics:
$$
P_{false} = \frac{days(real_{(is-event)})}{days(flagged_{(is-event)})}
$$
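
The ratio can be computed literally from two logical vectors with one entry per day. The function name `event_ratio` is hypothetical, and how the "real" events are identified (for example by visual inspection) is not specified in these notes, so `real_event` is simply taken as given.

```r
# Sketch: days(real is-event) / days(flagged is-event), as written in the formula.
# Assumption: both inputs are logical vectors with one value per day.
event_ratio <- function(real_event, flagged_event) {
  n_flagged <- sum(flagged_event, na.rm = TRUE)
  if (n_flagged == 0) return(NA_real_)  # ratio undefined without flagged days
  sum(real_event, na.rm = TRUE) / n_flagged
}

ratio_toy <- event_ratio(c(TRUE, TRUE, FALSE, FALSE), c(TRUE, TRUE, TRUE, TRUE))
```
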